We investigate the searchability of complex systems in terms of their interconnectedness. Associating
searchability with the number and size of branch points along the paths between the nodes, we find that
scale-free networks are relatively difficult to search, and thus that the abundance of scale-free
networks in nature and society may reflect an attempt to protect local areas in a highly interconnected
network from nonrelated communication. In fact, starting from a random node, real-world networks with
higher order organization like modular or hierarchical structure are even more difficult to navigate
than random scale-free networks. The searchability at the node level opens the possibility for a
generalized hierarchy measure that captures both the hierarchy in the usual terms of trees as in
military structures, and the intrinsic hierarchical nature of topological hierarchies for scale-free
networks as in the Internet.

PACS numbers: 89.75.Fb, 89.70.+c



I. INTRODUCTION 



In most complex systems, each element interacts directly with only a few particular elements. Distant
parts of the network thereby formed can consequently communicate through sequences of local
interactions. In this way all parts of the network can be reached from other parts, but not all such
communications are equally easy or accurate [1, 2, 3, 4]. The purpose of this paper is to investigate
the interplay between the searchability of a network and the network structure. By searchability or
navigability we mean the difficulty of sending a signal between two nodes in a network without
disturbing the remaining network. We use a city-street network to illustrate the concept of
navigability in networks [5]. As in Fig. 1(c), the streets are identified as nodes and intersections
between the streets as links between the nodes. From this point of view, the above statement reads: A
pedestrian or driver on a street in a city can, by multiple choices, reach any other street in the
city via the intersections. However, not all streets are equally easy to find, and the difficulty of
finding a street may vary from city to city.



In the current paper we investigate how different network topologies influence the average amount of
information that is needed to send a signal from one node to another node in the network. We
consistently concentrate on specific signaling, and focus only on locating one specific node without
disturbing the remaining network. This is different from nonspecific broadcasting, where any input is
amplified by all exit links of every node along all paths, as in the spreading of spam or the
propagation of diseases and computer viruses [6, 7]. We present a quantification of the specific
signaling and justify our choice of measure by its minimum information property.




FIG. 1: (Color online) An example where the search information is an important concept. (a)
illustrates a visitor's perspective of an unknown city. The visitor therefore asks a citizen with the
perspective (b) of the city, or rather the higher abstraction level (c). This level is the dual map of
the city, a network where streets are identified as nodes and intersections between streets as links
between the nodes. We use this level to quantify the search information in (d): the average number of
yes-no questions the visitor must ask the citizen to find a specific street. The information necessary
to walk the shortest path from s in the lower right corner to t in the upper right corner is
$\log_2 36$ bits, or roughly 6 yes-no questions.



II. SEARCH INFORMATION

We consider a specific signal, or a walker, on a network and assume that the specific signal from a
source s to a target t is a signal that travels along the shortest path, and thereby minimizes the
disturbance on other nodes. This assumption is made on the basis that the shortest path is a good
estimate for typical traffic in a network [8]. We will later discuss the alternative model, to follow
the minimal information path, which does not necessarily coincide with the shortest path. The minimal
amount of information needed to follow a specific shortest path is determined by the degrees of the
nodes along the path, i.e., the number and size of the branch points between the nodes. That is, a
walker on the network first has to choose the right exit link (we call it the path link) among the
$k_s$ possible links from s. The cost depends on the available global information on the node and the
way the information is organized at the node [?]. If no information is available, the choice must be
random, and the walker will perform a random walk. We here consider scenarios where the network
represents the communication backbone of a system with available information on the node level, e.g.,
social networks [?], computer networks [10], city networks [5, 11], etc. In principle, if the exit
links are unordered, one yes-no question must be asked for every exit link to find the path link. On
average this would give rise to a cost of $k/2$ yes-no questions, or $(k-1)/2$ if the arrival link at
the node is known and one link can immediately be excluded. This is illustrated in Fig. 2(c).




FIG. 2: (Color online) The information cost at each node depends on the ordering of the links. (a) The
information cost does not depend on the degree if there is only one possible link. (b) If the links
are ordered, it is possible to ask yes-no questions and successively eliminate groups of wrong exits.
Every yes-no question optimally reduces the number of possible links by a factor of 2, and the cost to
find the correct exit link is $\log_2(k-1)$. (c) If the links are unordered, such groupings are
impossible and every exit link must be considered. In such a scenario the average number of necessary
yes-no questions is $(k-1)/2$.

The other extreme situation is when the exit path somehow is given by default, or the information cost
can be neglected in comparison to the walk on the shortest path itself [Fig. 2(a)]. We here focus on
the case where the links are ordered, like intersections along a road. In this case a question can be
used to reduce the possible outcomes by a factor of 2. A city example: The yes-no answer to "Does any
one of the eight closest roads lead toward the station?" reduces the outcome to eight roads if there
were 16 possible intersecting roads to choose from. The total number of bits, or roughly the number of
yes-no questions, necessary to find the path link is $\log_2(k)$, or $\log_2(k-1)$ if the arrival
node is known not to lead to the target, as in Fig. 2(b).
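The per-node costs discussed above can be compared with a minimal Python sketch. The function names `ordered_cost` and `unordered_cost` and the `arrival_known` flag are illustrative assumptions, not part of the original analysis; the ordered case assumes that every yes-no question can halve the remaining exits.

```python
import math

def ordered_cost(k, arrival_known=True):
    """Bits needed to single out one exit among ordered links (repeated halving)."""
    options = k - 1 if arrival_known else k
    return math.log2(options) if options > 0 else 0.0

def unordered_cost(k, arrival_known=True):
    """Average number of yes-no questions when the exits cannot be grouped."""
    options = k - 1 if arrival_known else k
    return options / 2

# City example from the text: 16 intersecting roads, arrival link not excluded.
print(ordered_cost(16, arrival_known=False))    # 4.0 bits
print(unordered_cost(16, arrival_known=False))  # 8.0 questions on average
```

The gap between the two costs grows quickly with the degree, which is why the ordering of links matters most at hubs.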

That is, $\log_2(k_s)$ bits of information are necessary at the start node s, where $k_s$ is the
degree of s. Subsequently the walker at each node $j \in p(s,t)$ along the path $p(s,t)$ has to choose
the particular exit link along the path. Given the knowledge to follow the path to j, there are
$k_j - 1$ unknown exit links from j, and the information needed to make the next step is
$\log_2(k_j - 1)$. As a result the total information needed to follow the path is

$$S_u(p(s,t)) = \log_2(k_s) + \sum_{j \in p(s,t)} \log_2(k_j - 1), \tag{1}$$

where $p(s,t)$ includes the nodes on the path between s and t, but not the start and end nodes s and t
[see Fig. 2(a)]. We use the notation $S_u$ to emphasize that the walk is a result of decisions for a
specific and unique path, and repeat that we use $\log_2(k_j - 1)$ at every step but the first since
the link of arrival is known (Fig. 2).
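A minimal sketch of Eq. (1) follows, assuming the path is supplied simply as the list of node degrees from source to target. The helper name `unique_path_information` and the example degrees are hypothetical.

```python
import math

def unique_path_information(degrees):
    """S_u for one path, given the degrees [k_s, k_j1, ..., k_t] along it."""
    k_s, *rest = degrees
    inner = rest[:-1]  # intermediate nodes only; the target's degree does not enter
    return math.log2(k_s) + sum(math.log2(k - 1) for k in inner)

# Hypothetical path: source of degree 4, intermediate nodes of degree 3 and 5, target of degree 2.
print(unique_path_information([4, 3, 5, 2]))  # log2(4) + log2(2) + log2(4) = 5.0
```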

If there is more than one shortest path between s and t, the information needed to travel along one of
the shortest paths has to include the thereby added degenerate possibilities [11]. Degenerate paths
imply that more than one exit link can lead the walker closer to the target from each node, and should
be reflected in a decreased path information $S_d(\{p(s,t)\})$; the subscript d is for degenerate
paths and $\{p(s,t)\}$ is for the set of paths between s and t. If a node j has $k_j$ links, of which
$\eta_j$ links point toward the target node t, then the number of bits to locate one of the correct
exits is reduced to $\log_2[(k_j - 1)/\eta_j]$ [and to $\log_2(k_s/\eta_s)$ for the first step at the
source node s]. In this definition we make the assumption that the probability of choosing any exit
link on a shortest path from the current node is equal. Therefore the degenerate paths are in general
selected with different probabilities, as indicated in the example of Fig. 3(b). That is, each path in
the set of degenerate paths $\{p(s,t)\}$ is selected with probability

$$P(p(s,t)) = \frac{1}{\eta_s} \prod_{j \in p(s,t)} \frac{1}{\eta_j}. \tag{2}$$

The average number of bits needed to follow a random shortest path is accordingly

$$S_d(s \to t) = \sum_{\{p(s,t)\}} P(p(s,t)) \, \log_2\left[\frac{k_s}{\eta_s} \prod_{j \in p(s,t)} \frac{k_j - 1}{\eta_j}\right]. \tag{3}$$

This simplifies to Eq. (1) in the case where there is only one shortest path. When there are
degenerate paths between s and t, $S_d(s \to t)$ does not distinguish paths that are difficult to
follow from the easier ones, but just averages.
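The averaging over degenerate paths can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the network is an undirected adjacency dictionary, every shortest-path exit is chosen with equal probability, and the names `distances_to`, `degenerate_path_information` and the toy network are hypothetical.

```python
import math
from collections import deque

def distances_to(adj, t):
    """Breadth-first distances from every node to the target t."""
    dist = {t: 0}
    queue = deque([t])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def degenerate_path_information(adj, s, t):
    """Average bits when all exits that lead closer to t are equally likely."""
    dist = distances_to(adj, t)

    def walk(v, prob, bits, first):
        if v == t:
            return prob * bits
        exits = [u for u in adj[v] if dist[u] == dist[v] - 1]
        eta = len(exits)
        # The arrival link is excluded everywhere except at the source.
        options = len(adj[v]) if first else len(adj[v]) - 1
        step_bits = math.log2(options / eta)
        return sum(walk(u, prob / eta, bits + step_bits, False) for u in exits)

    return walk(s, 1.0, 0.0, True)

# Hypothetical toy network: two shortest paths s-a-t and s-b-t, where b is a small hub.
adj = {
    "s": ["a", "b"], "a": ["s", "t"], "b": ["s", "t", "x", "y"],
    "t": ["a", "b"], "x": ["b"], "y": ["b"],
}
print(degenerate_path_information(adj, "s", "t"))  # 0.5 * log2(3), about 0.79 bits
```

In this toy case the choice at s is free because both exits lead toward t, so the only cost is the $\log_2 3$ bits paid at the hub b on half of the walks.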

FIG. 3: (Color online) Search information with degenerate paths between the source s and target t. The
numbers around the nodes indicate with what probability the link is chosen on the walk to t. The
boldfaced number is the information cost in bits at the given node. The total information cost
$S(s \to t)$ above every network is given by the average cost over all paths marked with black lines
(the sum of the costs at the nodes along the paths), weighted with the probability to walk the path
(width of black lines). We present three scenarios. (a) The walker aims to take a specific shortest
path, here the cheapest one informationwise. (b) The walker chooses randomly between two exits, both on
a shortest path. This results in a lower total information cost since a random choice does not cost
any information, even though some walks will go through the expensive hub to the right (2.3 bits). (c)
The walker chooses to minimize the average information cost between s and t. The difference between
(b) and (c) is clear from the choice at node j. In (c) more information is used at this node to avoid
the higher cost of going to the hub to the right. This is completely avoided in (a) by going to the
left at s, but at a higher information cost.

The average path information $S_d(s \to t)$ is closely related to the earlier introduced search
information [5, 12, 13]:

$$S(s \to t) = -\log_2 \sum_{\{p(s,t)\}} \frac{1}{k_s} \prod_{j \in p(s,t)} \frac{1}{k_j - 1}, \tag{4}$$

where the sum runs over the set $\{p(s,t)\}$ of degenerate shortest paths between s and t. Thus,
again, if there are no degenerate shortest paths, $S(s \to t) = S_u(s \to t)$. If there are degenerate
paths, the relative weighting of these paths differs. In the $S_d$ measure each path is weighted
according to the branching of the shortest paths shown in Fig. 3(b), and $S_d$ is thus the typical
information needed to follow a random branch of one of the shortest paths through the network. In
contrast, S measures the minimal information value of knowing the full path, and the subscript m for
minimal is omitted. S is defined as $-\log_2$ of the probability that a nonguided signal emitted from
s arrives at t with the minimal number of steps. For all networks we have tested, $S_d$ is maximally a
few percent larger than S, reflecting that situations where one of the branches is substantially more
difficult to travel only give a small additional correction to $S_d$ (see Fig. 4). Also, we always
found indistinguishable results when we analyzed the networks in terms of the conditional uniform test
$S - S(\mathrm{random})$ or in terms of $S_d - S_d(\mathrm{random})$ [14, 15, 16].
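The definition of S as minus the logarithm of a random-walk probability can be evaluated directly on a small network. The sketch below is a hedged illustration: it assumes an undirected adjacency dictionary and a nonguided walker that never steps back along its arrival link, and the function name `search_information` and the toy network are hypothetical. The plain recursion is only meant for small examples.

```python
import math
from collections import deque

def search_information(adj, s, t):
    """-log2 of the probability that a nonguided walker follows some shortest path s -> t."""
    dist = {t: 0}
    queue = deque([t])
    while queue:  # breadth-first distances to the target
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)

    def reach(v, first):
        if v == t:
            return 1.0
        # All links are candidates at the source, all but the arrival link afterwards.
        options = len(adj[v]) if first else len(adj[v]) - 1
        exits = [u for u in adj[v] if dist[u] == dist[v] - 1]
        return sum(reach(u, False) for u in exits) / options

    return -math.log2(reach(s, True))

# Hypothetical toy network: two shortest paths s-a-t and s-b-t, where b is a small hub.
adj = {
    "s": ["a", "b"], "a": ["s", "t"], "b": ["s", "t", "x", "y"],
    "t": ["a", "b"], "x": ["b"], "y": ["b"],
}
print(search_information(adj, "s", "t"))  # -log2(1/2 + 1/6), about 0.58 bits
```

For comparison, following only the cheaper path s-a-t in this toy network costs $\log_2 2 + \log_2 1 = 1$ bit, so the degenerate paths lower the search information.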

To get the corresponding probabilities to follow a given path as in Eq. (2), we present a simple
example of the minimum information property of S, and choose the path from j to t in Fig. 3(c) as an
example path. Let the probability to take the left path be $q_1$ and the right path via the hub be
$q_2 = 1 - q_1$, and further let the probability to reach the target be $p_1$ after the left choice
is taken and $p_2$ if the right choice is taken. The probability to choose the link down to the left
is 0, since it is not on a shortest path to t. Then the total information cost from j to t is

$$S(j \to t) = (\log_2 3 + q_1 \log_2 q_1 + q_2 \log_2 q_2) - (q_1 \log_2 p_1 + q_2 \log_2 p_2),$$

where the first parentheses on the right-hand side is the information cost to pay at node j. The full
expression of this term reads

$$-\sum_{i=1}^{3} \frac{1}{3} \log_2\left(\frac{1}{3}\right) - \left[-\sum_{i=1}^{3} q_i \log_2(q_i)\right], \tag{5}$$

with $q_3 = 0$ for the link down to the left (and the convention $0 \log_2 0 = 0$). It is the
difference between the information entropy of a random choice and the information entropy of the
actual choice, i.e., the meaningful information of the choice. The information cost paid at node j
ensures that the walker takes the path to the left with probability $q_1$ and to the right with
probability $q_2$. This is equivalent to the meaningful information content of a policeman in the
crossing who points toward the left with probability $q_1$, to the right with probability $q_2$, and
never down to the left (since it is not on a shortest path to t). The remaining two terms in the
expression for $S(j \to t)$ represent the cost from the next step to the target as two contributions
according to Eq. (4), weighted with the probabilities $q_1$ and $q_2$ of choosing the paths.

We set $dS/dq_1 = 0$ to find the minimum. With $q_2 = 1 - q_1$ we get

$$0 = \log_2 q_1 - \log_2(1 - q_1) - \log_2 p_1 + \log_2 p_2, \tag{6}$$

or $q_1/q_2 = p_1/p_2$, satisfied by

$$q_1 = p_1/(p_1 + p_2), \quad q_2 = p_2/(p_1 + p_2). \tag{7}$$

Inserting this back into the expression for $S(j \to t)$ gives

$$S(j \to t) = -\log_2\left(\frac{1}{3} p_1 + \frac{1}{3} p_2\right), \tag{8}$$

which is identical to Eq. (4). Effectively $q_1$ and $q_2$ weight the probability of choosing an exit
from j with the difficulty of following it. For example, paths that contain large hubs will be
suppressed because the probability of following such paths randomly is lower.
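The minimization above can be checked numerically. The following Python sketch scans $q_1$ on a grid and compares the minimum with the closed forms of Eqs. (7) and (8); the values of `p1` and `p2`, the grid resolution and the function name `cost` are arbitrary choices made for illustration.

```python
import math

def cost(q1, p1, p2, exits=3):
    """S(j -> t) when the left exit is taken with probability q1."""
    q2 = 1.0 - q1
    at_node = math.log2(exits) + q1 * math.log2(q1) + q2 * math.log2(q2)
    beyond = -(q1 * math.log2(p1) + q2 * math.log2(p2))
    return at_node + beyond

p1, p2 = 0.5, 0.125                        # hypothetical success probabilities
q1_opt = p1 / (p1 + p2)                    # Eq. (7)
closed_form = -math.log2(p1 / 3 + p2 / 3)  # Eq. (8)

grid = [i / 1000 for i in range(1, 1000)]
q1_best = min(grid, key=lambda q: cost(q, p1, p2))
print(q1_opt, q1_best)                     # both 0.8
print(cost(q1_opt, p1, p2), closed_form)   # agree up to rounding
```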

To be able to characterize the complete network in terms of searchability we define S as the average
of the pairwise search information over all pairs of nodes,

$$S = \frac{1}{N^2} \sum_{s,t} S(s \to t). \tag{9}$$

Thus, although S is defined in terms of global random walkers, it should be interpreted as subsequent
and local minimization of information costs to navigate to a target node. It is therefore different
from the random walker approach that has been used to characterize topological features of networks
[17, 18], including first passage times [19], large scale modular features [20], and search utilizing
topological features [?]. Neither should the search information with its logarithm of base 2 be mixed
up with entropy measures associated with the degree distribution, measures related to the dominating
eigenvector of the adjacency matrix [22], or different flows on networks like betweenness centrality
and closeness centrality [23, 24, 25]. Instead S measures the amount of information that turns a
random walker into a directed walker that follows a shortest path (or any other chosen path) between
the source s and target t.

Some insight into the search information S, which also makes the difference from a pure entropy
measure clear, is obtained if we consider the simple average along one of the shortest paths, and
ignore the information associated with having arrived from a link that cannot lead closer to t:

$$S(s,t) = \log_2\left(k_s \prod_{j \in p(s,t)} k_j\right), \tag{10}$$

with a total average path information

$$S = \frac{1}{N(N-1)} \sum_{j} b(j) \log_2(k_j), \tag{11}$$

which differs from a pure entropy measure of the form $\sum p \log p$ since b(j) is proportional to
$k_j$ only when the walk is random. Here b(j) is the traffic betweenness of the node j, defined as the
number of shortest paths between pairs of nodes in the network that pass through node j, including
paths that start at j or paths that end at j. This traffic betweenness differs from the usual
betweenness [23, 24] by the different treatment of degenerate paths, in the sense that a given
degenerate path contributes to the betweenness with a weight given by the difficulty of walking the
path. In practice, in all the real networks that we have investigated, we found that the difference is
negligible. We thus expect relatively large S values for networks (1) where there are many nodes on
the shortest path between other nodes [most b(i) large], and (2) where most traffic goes through
highly connected nodes. Point 1 predicts large S for modular networks, whereas point 2 suggests
relatively large S for networks with broad degree distributions. The path length is indirectly coupled
to points 1 and 2; stringy networks as well as regular networks with long average path lengths have
high S, and starlike networks have small S despite point 2, because of the very short paths. In the
remaining part of this paper we will examine the interplay between S and global topology in detail.



III. SEARCH INFORMATION IN MODEL NETWORKS 

The search information is topology dependent, and in this section we present how S captures the
average degree, the degree distribution, and the higher order topological organization of the
networks.

FIG. 4: (Color online) Search information of random Erdős-Rényi (ER) networks as a function of the
average degree $\langle k \rangle$. The number of nodes is $N = 10^3$ and we keep the networks
connected. The shaded area is the contribution from degenerate paths, with the upper edge
corresponding to the definition of $S_u$ in Fig. 3(a). The definition in Fig. 3(b) is inseparable in
this plot from the search information according to Fig. 3(c). The degenerate paths make the highly
connected networks more searchable, mainly due to degenerate paths of length 2 between each pair of
nodes. Thus the resulting S is lower than that obtained when considering the information associated
with locating just one of the shortest paths (upper border of the shaded area). For very low degrees,
$\langle k \rangle \sim 2$, the organization of the networks opens for a broad range of different
topologies with very different searchability; the average shortest path increases and finally no
degenerate paths exist.



Figure 4 shows how S depends on the average degree $\langle k \rangle$ in a random network. The lower
curve is the total S and the shaded area represents the contribution from degenerate paths. The upper
border of the shaded area is consequently $S_u$, the search information without degenerate paths.
$S_d$ ($S_u > S_d > S$), which weights paths according to the branch points along the paths, is within
the shaded area (although it is indistinguishable from the lower curve in the present case). Notice
that the figure mostly examines very high $\langle k \rangle$ values, where most pairs of nodes are
connected by multiple degenerate paths of length 2. This explains the reduction in search information
due to degenerate paths, which becomes small for the real-world networks where $\langle k \rangle$ is
1-10. For these small $\langle k \rangle$, S depends crucially on the global topological
organization: it is $\log_2 2 = 1$ for a one-dimensional string, $\log_2 N$ for a star, but of order
N/4 for a stringy structure with many separated branches of length 1 (see Fig. 4). The increase of the
average shortest path length $\langle l \rangle$, plotted as a dashed line, indicates that the stringy
structure dominates in the ensemble of random networks with low $\langle k \rangle$.
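The limiting values quoted for the string and the star can be reproduced with a short Python sketch. It is an illustration under assumptions: S is averaged here over the N(N-1) ordered pairs of distinct nodes, which is one possible normalization, and the names `search_information`, `average_search_information`, `string` and `star` are hypothetical.

```python
import math
from collections import deque

def search_information(adj, s, t):
    """-log2 of the probability that a nonguided walker follows a shortest path s -> t."""
    dist = {t: 0}
    queue = deque([t])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)

    def reach(v, first):
        if v == t:
            return 1.0
        options = len(adj[v]) if first else len(adj[v]) - 1
        exits = [u for u in adj[v] if dist[u] == dist[v] - 1]
        return sum(reach(u, False) for u in exits) / options

    return -math.log2(reach(s, True))

def average_search_information(adj):
    """Average over ordered pairs of distinct nodes (normalization assumed for this sketch)."""
    nodes = list(adj)
    total = sum(search_information(adj, s, t) for s in nodes for t in nodes if s != t)
    return total / (len(nodes) * (len(nodes) - 1))

n = 50
string = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
star = {0: list(range(1, n))}
star.update({i: [0] for i in range(1, n)})

print(average_search_information(string))              # close to log2(2) = 1
print(average_search_information(star), math.log2(n))  # both of order log2(N)
```

On the string only the first step costs a bit, since every later node has a single unknown exit; on the star almost every walk pays about $\log_2 N$ bits at the hub.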

In Fig. 5 we demonstrate that $s = S/\log_2(N)$ is nearly a size-independent way to compare networks
of different size with each other [5]. Thus this quantity is approximately invariant for any given
type of network topology, whether it is dominated by a single hub (star), whether it is scale-free
(SF), or whether it is of Erdős-Rényi (ER) type. In all cases we compare networks with the same
average degree and find that s nicely differentiates between different types of networks with a given
amount of links between the nodes. The asymptotic logarithmic scaling can be understood by the
logarithmic increase in average shortest path length for Erdős-Rényi networks,
$\langle l_{sp} \rangle \propto \log N$ [?], and the constant cost at every node [$\log_2(k)$]. For
scale-free networks it is a little deeper, but simple for the extreme cases. For $\gamma = 2$ the
average shortest path length is constant and the size of the largest hub in the network scales
linearly with the system size [26]. As almost all shortest paths will go through this "superhub" as in
a star network, the search information is proportional to $\log_2 N$. For $\gamma > 3$ the average
shortest path length scales as $\log N$ and the largest hub is finite, similar to Erdős-Rényi networks
[27].

FIG. 5: (Color online) S as a function of system size N for fully connected random network topologies
with fixed average degree $\langle k \rangle = 5$. ER refers to Erdős-Rényi random networks and SF to
random scale-free networks with degree distribution $\propto 1/(k_0 + k)^{2.4}$, with $k_0$ adjusted
for every network size so that $\langle k \rangle$ is kept fixed. A star network with one node
connected to the remaining nodes in the network, and the remaining links randomly distributed, scales
asymptotically as $S = \log_2 N + 1$, as in principle every shortest path goes through the hub with
the cost $\log_2(N - 1)$.

From Fig. 5 we also notice that scale-free networks have the largest S, at least as long as we
consider a random organization of the topology. This is because nodes with large values of $k_i$ also
have large b(i) and therefore contribute relatively more to the overall confusion according to
Eq. (11). This fact is explored more in Fig. 6, where we show the variation of $S/\log_2(N)$ as a
function of the degree distribution quantified by $\gamma$. At low $\gamma \sim 2$, where a scale-free
network effectively behaves very similarly to a star network, the largest hubs tend to be connected to
a major fraction of the system. A typical path therefore passes through a major hub of degree
$k \propto N$ and maybe one more node, as indicated by the average shortest path length. For larger
$\gamma$ the high cost of passing nodes with $k \propto N$ disappears, but the total average cost
nevertheless increases since the path length increases rapidly. In Fig. 6(b) the average degree is
kept constant by adjusting $k_0$ in the degree distribution $P(k) \propto (k_0 + k)^{-\gamma}$. This
weakens the increase in average path length as $\gamma$ increases, and S instead slowly decreases
because the probability for having very large hubs decreases.

FIG. 6: (Color online) S as a function of the exponent $\gamma$ for random scale-free networks with
degree distribution $P(k) \propto 1/k^{\gamma}$ in (a). Varying $\gamma$ implies varying the average
degree $\langle k \rangle$ according to the second row of axis labels. An increased $\gamma$ also
implies a decreased frequency of large hubs and a lengthened average shortest path between nodes. To
separate the effects we study in (b) the distribution $P(k) \propto 1/(k_0 + k)^{\gamma}$ with $k_0$
set to keep $\langle k \rangle$ fixed, as in the inset. Here we see that even though the average
shortest path increases, S decreases slightly due to a decreased frequency of hubs. However, compared
to (a) the average shortest path increases substantially less with increasing $\gamma$, due to the
constant $\langle k \rangle$. The increasing average shortest path accordingly dominates over the
decreasing frequency of hubs for $\gamma > 2.2$ in (a). The size of the networks is $N = 10^4$ in both
(a) and (b).

We now turn to networks with narrow degree distributions but nonrandom topologies, and start with an
illustrative calculation of S for a tree hierarchy. We obtain $S = 2\log(N) - 5$ numerically for trees
of different branching ratios d (Fig. 7), which was corroborated analytically for a binary tree.
However, S depends on the addition of links to the tree, and in particular S is larger for the club
tree, as numerically demonstrated in Fig. 7. In any case S for trees is much larger than for random
networks. The reason why trees are perceived as efficient is (1) that they are efficient seen from the
top (e.g., data structures), and (2) that trees are mostly associated not with specific signaling, but
rather with broadcasting of information, where everyone in a certain section is given the same
information (e.g., military organization). A regular network (every node connected to 2d neighbors,
where d is the dimension of the lattice) has an even higher information cost, as the shortest path
length scales as $N^{1/d}$. If the links of the regular network instead represent street segments
between intersections in a square city like Manhattan (streets and avenues mapped to nodes and
intersections to links between the streets and avenues in a fully connected bipartite network [5]),
the result is completely different. Let N streets be divided into N/2 north-south (NS) streets and N/2
east-west (EW) streets. Going from any NS street to a particular EW street demands information about
which of the N/2 exits is correct. This information cost is
$S(\mathrm{NS} \to \mathrm{EW}) = \log_2(N/2)$. To go from one NS street to another NS street means
that any of the N/2 EW streets can be chosen. Each path is thus assigned a probability
$(2/N)[1/(N/2 - 1)]$. But there are in fact N/2 degenerate paths, and the total information cost for
locating parallel roads in this square city reduces to

$$S(\mathrm{NS} \to \mathrm{NS}) = -\log_2\left[\frac{N}{2} \, \frac{2}{N} \, \frac{1}{N/2 - 1}\right] = \log_2(N/2 - 1), \tag{12}$$

reflecting the fact that it does not matter which of the EW roads one uses to reach the target road.
This places the fully connected bipartite network in the same class as the star network.

As an example of typical organization in social systems we also show the N dependence of modular
networks in Fig. 7. Again, S is larger than in any random network irrespective of the degree
distribution. We can therefore extend the previous statement that the value of $s = S/\log_2(N)$ is
related to the global organization principle to include both the degree distribution, Fig. 5, and the
way the nodes are positioned relative to each other, Fig. 7.

FIG. 7: (Color online) S versus N for tree, club-tree, and modular networks. Both trees have a
branching ratio d of 4, as seen in the illustration above. The modular network consists of communities
of ten nodes, each of them connected to five other nodes. Each community is in turn connected with
three other communities.

Again, all results are robust to the details in the formulation of S, and very similar results would
have been obtained if we instead had considered $S_u$ and excluded degeneracy, or $S_d$ with a
different weight of the degenerate paths. To extend this we show in Fig. 8(b) the deviations between
the search information S and a minimum search information $S_{\min}$, where we take the minimum
information concept to the extreme and look for the path, regardless of length, that has the smallest
information cost. This would typically be a path that avoids hubs. In Fig. 8(a) it is clear that the
right choice is cheaper informationwise even though the path is longer. Intuitively, the fraction of
minimum information paths that are also shortest paths will decay as the paths get longer and longer.
This is confirmed in Fig. 8(b) for a scale-free network of size $N = 10^4$ with $\gamma = 2.4$.
Nevertheless, the difference from the minimum information path is small. This observation is valid in
the case of logarithmic information cost at every node, as in the present case. If the cost instead
were linear, as in Fig. 2(c), the difference would be substantial, as hubs would repel the minimum
information paths much more.




FIG. 8: (Color online) The minimum information path, of length $l_{S_{\min}}$, is the path between two
nodes that regardless of distance has the smallest cost. In (a) it is clear that the right path is
cheaper informationwise but longer in number of steps. (b) shows the fraction of minimum information
paths that are also shortest paths in a scale-free network with $N = 10^4$ nodes and $\gamma = 2.4$.
Although the overall fraction is as low as 0.62, $S_{\min}$ is only 5% smaller than S. As degeneracy
is not considered in $S_{\min}$, we compare with S without degeneracy, $S_u$, as in Fig. 3(a).
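One way to find a minimum information path of the kind shown in panel (a) is a Dijkstra search in which leaving a node costs the logarithm of its number of unknown exits. The Python sketch below assumes this logarithmic cost and a target that is reachable from the source; it is not necessarily the procedure used for panel (b), and the function name `minimum_information_path` and the toy network are hypothetical.

```python
import heapq
import math

def minimum_information_path(adj, s, t):
    """Cheapest path in bits: log2(k) at the source, log2(k - 1) at later nodes."""
    best = {s: 0.0}
    parent = {s: None}
    heap = [(0.0, s)]
    while heap:
        bits, v = heapq.heappop(heap)
        if v == t:
            break
        if bits > best[v]:
            continue  # stale heap entry
        options = len(adj[v]) if v == s else len(adj[v]) - 1
        if options == 0:
            continue  # dead end: the only link is the arrival link
        step = bits + math.log2(options)
        for u in adj[v]:
            if step < best.get(u, math.inf):
                best[u] = step
                parent[u] = v
                heapq.heappush(heap, (step, u))
    path = [t]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return best[t], path[::-1]

# Hypothetical network: the two-step route s-h-t passes a hub of degree 10,
# the three-step route s-a-b-t passes only nodes of degree 2.
leaves = [f"x{i}" for i in range(8)]
adj = {"s": ["h", "a"], "h": ["s", "t"] + leaves, "a": ["s", "b"],
       "b": ["a", "t"], "t": ["h", "b"]}
adj.update({x: ["h"] for x in leaves})

bits, path = minimum_information_path(adj, "s", "t")
print(bits, path)  # 1.0 ['s', 'a', 'b', 't']
```

The shortest route through the hub would cost $1 + \log_2 9$ bits, about 4.2, so here the longer route is cheaper informationwise.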



IV. NODE ORGANIZATION 

We have until now presented a tool to characterize networks on the global level, and quantified
networks as being easy or difficult to navigate or search on average. We now turn to the effect the
organization of networks has on the individual nodes. The specific communication approach opens up a
natural way to characterize the different networks in terms of their ability to distribute
communication options among their nodes. We therefore define the hide

$$\mathcal{H}(t) = \sum_{s} S(s \to t)/N \tag{13}$$

as the average number of bits a walker needs to walk directly from a random node in the network to the
target node t [12]. The different values of the hide reflect to what degree the nodes are visible. Low
hide $\mathcal{H}$, or low average information cost to find the node, represents high visibility. This
is illustrated in Fig. 9 where, in agreement with intuition, we find that Erdős-Rényi networks are by
far the most democratic, whereas scale-free networks and especially tree hierarchies are hugely
elitist. In particular the tree hierarchy has localized all communication (low $\mathcal{H}$ means high
visibility and thus ability to receive information) to the top nodes. In Fig. 9 we plot the democratic
spread as the difference between the most and the least visible node in the network, as an
illustrative estimate of the distribution of communication in the network.

FIG. 9: (Color online) Distribution of the hide $\mathcal{H}$ for nodes on various types of networks
(low $\mathcal{H}$ means high visibility). We see that the Erdős-Rényi (ER) network is quite
homogeneous, the scale-free (SF) network has a wider distribution, whereas the tree hierarchy is by
far the least democratic in distributing the ability of different nodes to communicate. The democratic
spread plotted below the box roughly estimates the division of communication in the network. The
number of nodes is $N = 10^4$ for all networks.
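The hide of individual nodes and the democratic spread can be illustrated on a small star network. The Python sketch below is hedged: it assumes the hide of a target is the sum of the search information from all other nodes divided by N, and the names `search_information`, `hide` and the star example are hypothetical.

```python
import math
from collections import deque

def search_information(adj, s, t):
    """-log2 of the probability that a nonguided walker follows a shortest path s -> t."""
    dist = {t: 0}
    queue = deque([t])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)

    def reach(v, first):
        if v == t:
            return 1.0
        options = len(adj[v]) if first else len(adj[v]) - 1
        exits = [u for u in adj[v] if dist[u] == dist[v] - 1]
        return sum(reach(u, False) for u in exits) / options

    return -math.log2(reach(s, True))

def hide(adj):
    """Average number of bits needed to reach each target from the other nodes."""
    n = len(adj)
    return {t: sum(search_information(adj, s, t) for s in adj if s != t) / n
            for t in adj}

# Hypothetical star network: hub 0 connected to nine leaves.
star = {0: list(range(1, 10))}
star.update({i: [0] for i in range(1, 10)})

H = hide(star)
print(H[0], H[1])                        # the hub is fully visible, a leaf is not
print(max(H.values()) - min(H.values())) # democratic spread of this toy network
```

Every leaf reaches the hub without any choice, so the hub has zero hide, whereas finding a given leaf costs about three bits from any other node.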

The different degrees of hide information of the various nodes effectively rank the nodes, and thereby
suggest a self-consistent measure of a hierarchy based on visibility. At the same time the hide
$\mathcal{H}$ captures both the hierarchy in the usual terms of trees, as in military structures, and
the intrinsic hierarchical nature of topological hierarchies for scale-free networks [29], as in the
Internet [?]. A highly ranked node is close to the top in a tree. The corresponding node in a
topological hierarchy is a highly connected node. In the Internet, for example, the highly connected
nodes play the roles of intermediate nodes on typical paths between nodes further down in the
hierarchy, just like top nodes in a tree. In analogy with [29, 30] we define a path from s to t to be
hierarchical if it defines a common boss for s and t. That is, the path first has to decrease
monotonically in $\mathcal{H}_j$ to more and more visible nodes, until a minimum, and thereafter
increase monotonically in $\mathcal{H}_j$ until the target node j = t is reached. We allow the path to
pass between nodes with the same value of $\mathcal{H}_j$, and we consider paths that only increase or
only decrease as hierarchical. Given $\mathcal{H}_j$ for each node $j \in [1, N]$ in a network, we
quantify the network's degree of information hierarchy $\mathcal{F}_{\mathcal{H}}$ by the fraction of
shortest paths between nodes in the network which are also hierarchical paths:

$$\mathcal{F}_{\mathcal{H}} = \frac{(\text{number of hierarchical shortest paths})}{N(N-1)}, \tag{14}$$
where the denominator counts the total number of shortest paths
between nodes in the network. In case of degenerate shortest paths,
each path contributes to $\mathcal{F}_{\mathcal{H}}$ by a weight given
by its contribution to the traffic betweenness. In accordance with
intuition we find that $\mathcal{F}_{\mathcal{H}}$ decays with system
size for random Erdős-Rényi networks, as shortest paths get longer.
The decay is plotted in the inset of Fig. 10.
$\mathcal{F}_{\mathcal{H}} = 1$ for both hierarchies and club
hierarchies, whereas for random scale-free networks
$\mathcal{F}_{\mathcal{H}}$ depends on the degree distribution.
Figure 10 shows how the information hierarchy varies with degree
distribution for pure random scale-free networks parametrized by
$P(k) \propto 1/k^{\gamma}$. As $\gamma$ increases from 2 the network
goes from being a complete information hierarchy with
$\mathcal{F}_{\mathcal{H}} = 1$ toward lower values of
$\mathcal{F}_{\mathcal{H}}$ when $\gamma$ approaches 3, the average
degree approaches 2, and shortest paths become long. For real-world
networks the overall observation is that biological networks are
antihierarchical with respect to $\mathcal{F}_{\mathcal{H}}$, while
social and communication networks tend to be hierarchical (see table
in Fig. 10). The Internet is a network of autonomous systems [31] that
in this data set consists of 6474 nodes and 12 572 links, and its
degree distribution is scale-free with $P(k) \propto 1/k^{2}$. In the
CEO network (6193 nodes and 43 074 links), chief executive officers
are connected by links if they sit on the same board [32]. The city
network is constructed by mapping 4127 streets to nodes and 5565
intersections to links between the nodes in the Swedish city of
Stockholm [5, 33]. Fly is the protein interaction network in
Drosophila melanogaster detected by the two-hybrid experiment [34],
and yeast refers to the similar network in Saccharomyces cerevisiae
[35].
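
The definition of a hierarchical path and the fraction in Eq. (14) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the results reported here: the hide values are supplied as a plain dictionary (the numbers in the example are made up), the helper names shortest_paths, is_hierarchical, and hierarchy_fraction are hypothetical, and degenerate shortest paths between a pair are simply given equal weight as a stand-in for the traffic-betweenness weighting. All shortest paths are enumerated explicitly, so the sketch is suitable for small networks only.

```python
from collections import deque
from itertools import permutations


def shortest_paths(adj, s, t):
    """Enumerate all shortest paths from s to t (lists of nodes) via BFS."""
    dist = {s: 0}
    preds = {s: []}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                preds[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                preds[v].append(u)
    if t not in dist:
        return []

    def build(v):
        # Walk predecessor links back to s, collecting every shortest path.
        if v == s:
            return [[s]]
        return [p + [v] for u in preds[v] for p in build(u)]

    return build(t)


def is_hierarchical(path, hide):
    """True if hide first never increases, reaches a minimum, then never decreases.

    Equal values on neighbouring nodes are allowed, and purely increasing or
    purely decreasing paths count as hierarchical.
    """
    values = [hide[v] for v in path]
    i = 0
    while i + 1 < len(values) and values[i + 1] <= values[i]:
        i += 1
    while i + 1 < len(values) and values[i + 1] >= values[i]:
        i += 1
    return i == len(values) - 1


def hierarchy_fraction(adj, hide):
    """Fraction of shortest paths that are hierarchical, normalized by N(N-1).

    Assumption: degenerate shortest paths between a pair share the weight equally.
    """
    nodes = list(adj)
    total = 0.0
    for s, t in permutations(nodes, 2):
        paths = shortest_paths(adj, s, t)
        if paths:
            total += sum(is_hierarchical(p, hide) for p in paths) / len(paths)
    n = len(nodes)
    return total / (n * (n - 1))


# Three-node chain whose middle node is the least visible one (illustrative values).
chain_adj = {0: {1}, 1: {0, 2}, 2: {1}}
chain_hide = {0: 0.0, 1: 1.0, 2: 0.0}
print(hierarchy_fraction(chain_adj, chain_hide))
```

In the chain example, the two end-to-end paths pass over a maximum of the hide and are therefore not hierarchical, while the four single-link paths are. For a tree in which the hide grows with the distance from the root, every shortest path goes up to a common boss and down again, so the function returns 1.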

Overall, for scale-free networks, the information hierarchy
$\mathcal{F}_{\mathcal{H}}$ quantitatively follows the topological
hierarchy $\mathcal{F}$ presented in [29]. Thus networks with maximal
(minimal) topological hierarchy $\mathcal{F}$ [29] also have large
(small) $\mathcal{F}_{\mathcal{H}}$. But it is important that the
information hierarchy allows for a natural generalization to
non-scale-free networks, and is therefore a unified definition of
hierarchical organization with the most visible node at the top. A
less powerful ranking is the betweenness [24], as the betweenness is
sensitive to links that shortcut important nodes. By adding links
between the children of a top node, as in the club tree in Fig. 7, the
ranking changes completely, as the betweenness for the top node in
principle would be zero, whereas its position at the top would still
be reflected by the hide ranking.
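
The sensitivity of betweenness to shortcut links can be checked on a toy example. The sketch below is an illustration under stated assumptions rather than the club tree of Fig. 7 itself: it uses a single top node with four children, the helper names (bfs_counts, betweenness, star, club) are hypothetical, and betweenness is taken as the number of unordered node pairs whose shortest paths pass through the node, with degenerate paths sharing the weight equally.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, s):
    """Return distances from s and the number of shortest paths to each node."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj, v):
    """Sum over pairs (s, t) of the fraction of shortest s-t paths through v."""
    info = {u: bfs_counts(adj, u) for u in adj}
    dist_v, sigma_v = info[v]
    total = 0.0
    others = [u for u in adj if u != v]
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        if v in dist_s and t in dist_s and dist_s[v] + dist_v[t] == dist_s[t]:
            total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total


def star(n_children):
    """Top node 0 linked to children 1..n_children."""
    adj = {0: set(range(1, n_children + 1))}
    for c in range(1, n_children + 1):
        adj[c] = {0}
    return adj


def club(n_children):
    """Same as star(), with additional links between all pairs of children."""
    adj = star(n_children)
    for a, b in combinations(range(1, n_children + 1), 2):
        adj[a].add(b)
        adj[b].add(a)
    return adj


print(betweenness(star(4), 0))  # every pair of children communicates via the top
print(betweenness(club(4), 0))  # children are linked directly, bypassing the top
```

In the plain star all six pairs of children route through the top node, whereas in the club version no shortest path uses it, so a betweenness ranking would no longer place the top node first.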



FIG. 10: (Color online) The hierarchical features of scale-free and
Erdős-Rényi networks with hide $\mathcal{H}$ representing the
importance of a node. The nodes in the networks in the top of the
figure are arranged in hierarchical order. The intuitively
hierarchical order of the tree is reproduced with the $\mathcal{H}$
ordering and all shortest paths are hierarchical,
$\mathcal{F}_{\mathcal{H}} = 1$. The network to the right has the
opposite property. A large fraction of the shortest paths are not
hierarchical since a typical path repeatedly goes up and down in the
$\mathcal{H}$ hierarchy. The table shows a number of real-world
networks and their $\mathcal{F}_{\mathcal{H}}$ together with the
corresponding value in randomized networks with the same degree
sequence [16]. The biological networks are randomized so that both
bait and prey degree of all proteins are preserved. The plot in the
bottom of the figure shows the behavior of scale-free and Erdős-Rényi
networks with respect to $\mathcal{F}_{\mathcal{H}}$. Scale-free
networks are $\gamma$ dependent whereas Erdős-Rényi networks are size
dependent.



V. CONCLUSION

Networks are a natural way to visualize the limited information
access experienced by individual parts of the overall system. In the
present paper we have explored topologies of a number of model
networks in terms of their ability to facilitate peer-to-peer
communication. The ability to transmit specific signals is quantified
in terms of the difficulty in navigating the networks, measured by the
search information S. As an overall lesson we have found that the
inequality

$$S(\text{real world}) > S(\text{random, fixed degree}) > S(\text{ER}), \tag{15}$$

is valid for all investigated real-world networks j5l [3 03 [3-. Here
S(random, fixed degree) represents randomized networks with exactly
the same degree distribution as the investigated real-world network,
whereas ER (Erdős-Rényi) networks only have the same total number of
nodes and links as the real-world network. The above inequality is in
particular associated with cases where the cost of passing a node is
proportional to log(k), but it is also true for the higher local
information cost proportional to k, where k is the degree of the node.
As S represents an average of the contribution from any node to any
other node, the major contribution to S comes from pairs of nodes that
are separated by large distances. The fact that S in realistic
networks is relatively large teaches us that the topology of
real-world networks disfavors distant specific communication
II 1 3ll 1 411. Topologically, large S was found in a number of model
networks with modular or hierarchical features, where highly connected
nodes are deliberately positioned "between" other nodes. This hints
that a large search information S is associated not only with broad
degree distributions, but also with well-known organizational features
of social and biological systems.

The peer-to-peer search information S(s → t) opens the possibility
for a detailed measure of the relative "importance" of nodes in a
given network. In fact, measuring visibility of a node t in terms of
how well hidden the node is from the rest of the network, as in
Eq. (13), we have shown how networks can be ranked in terms of a
generalized hierarchy measure. The measure captures both the hierarchy
in the usual terms of trees shown in Fig. 5 and at the same time also
the intrinsic topological hierarchical nature of scale-free networks.
Thus, this generalized hierarchy measure defines scale-free networks
with a degree distribution with exponent close to $\gamma = 2$ to be
hierarchical, whereas narrower distributions will be antihierarchical
unless they are deliberately organized in a treelike structure.

Overall, the different ways of organizing networks can be recast
according to their ability or inability to transmit specific messages
across the networks. The presented search information S provides a
useful measure of this key functional role that is reflected in the
topology of many real-world networks.